23 research outputs found

    Asynchronous Teams and Tasks in a Message Passing Environment

    As the discipline of scientific computing grows, so too does the "skills gap" between the increasingly complex scientific applications and the efficient algorithms required. Increasing demand for computational power on the march towards exascale requires innovative approaches. Closing the skills gap avoids the many pitfalls that lead to poor utilisation of resources and wasted investment. This thesis tackles two challenges: asynchronous algorithms for parallel computing and fault tolerance. First, I present a novel asynchronous task invocation methodology for Discontinuous Galerkin codes called enclave tasking. The approach modifies the parallel ordering of tasks, which allows for efficient scaling on dynamic meshes up to 756 cores. It ensures high levels of concurrency and intermixes tasks of different computational properties. Critical tasks along domain boundaries are prioritised for an overlap of computation and communication. The second contribution is the teaMPI library, which forms teams of MPI processes that exchange consistency data through an asynchronous "heartbeat". In contrast to previous approaches, teaMPI operates fully asynchronously with reduced overhead. It is also capable of detecting individually slow or failing ranks and inconsistent data among replicas. Finally, I provide an outlook on how asynchronous teams using enclave tasking can be combined into an advanced team-based diffusive load balancing scheme. Both concepts are integrated into and contribute towards the ExaHyPE project, a next-generation code that solves hyperbolic equation systems on dynamically adaptive Cartesian grids.

    Balance Assessment Using a Smartwatch Inertial Measurement Unit with Principal Component Analysis for Anatomical Calibration

    Balance assessment, or posturography, tracks and prevents health complications for a variety of groups with balance impairment, including the elderly population and patients with traumatic brain injury. Wearables can revolutionize state-of-the-art posturography methods, which have recently shifted focus to clinical validation of strictly positioned inertial measurement units (IMUs) as replacements for force-plate systems. Yet, modern anatomical calibration (i.e., sensor-to-segment alignment) methods have not been utilized in inertial-based posturography studies. Functional calibration methods can replace the need for strict placement of inertial measurement units, which may be tedious or confusing for certain users. In this study, balance-related metrics from a smartwatch IMU were tested against a strictly placed IMU after using a functional calibration method. The smartwatch and strictly placed IMUs were strongly correlated in clinically relevant posturography scores (r = 0.861–0.970, p < 0.001). Additionally, the smartwatch was able to detect significant variance (p < 0.001) between pose-type scores from the mediolateral (ML) acceleration data and anterior-posterior (AP) rotation data. With this calibration method, a large problem with inertial-based posturography has been addressed, and wearable, “at-home” balance-assessment technology is within possibility.

    teaMPI---replication-based resiliency without the (performance) pain.

    In an era where we cannot afford to checkpoint frequently, replication is a generic way forward to construct numerical simulations that can continue to run even if hardware parts fail. Yet, replication often is not employed on larger scales, as naïvely mirroring a computation once effectively halves the machine size, and as keeping replicated simulations consistent with each other is not trivial. We demonstrate for the ExaHyPE engine, a task-based solver for hyperbolic equation systems, that it is possible to realise resiliency without major code changes on the user side, while we introduce a novel algorithmic idea where replication reduces the time-to-solution. The redundant CPU cycles are not burned “for nothing”. Our work employs a weakly consistent data model where replicas run independently yet inform each other through heartbeat messages whether they are still up and running. Our key performance idea is to let the tasks of the replicated simulations share some of their outcomes, while we shuffle the actual task execution order per replica. This way, replicated ranks can skip some local computations and automatically start to synchronise with each other. Our experiments with a production-level seismic wave-equation solver provide evidence that this novel concept has the potential to make replication affordable for large-scale simulations in high-performance computing.
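
The heartbeat idea in the abstract above can be illustrated with a small, MPI-free sketch. It is not the teaMPI API: the class `HeartbeatMonitor`, its methods, and the thresholds `slow_factor` and `timeout` are hypothetical names and values chosen for illustration. The sketch assumes each replica (team) reports a timestamp per heartbeat, and that a replica counts as slow when its latest heartbeat interval is clearly longer than the fastest replica's, and as failed when it has been silent for too long.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat times per replica (team) and classifies each one.

    Hypothetical illustration only; names and thresholds are assumptions.
    """
    slow_factor: float = 1.5   # assumed: slow if interval exceeds 1.5x the fastest
    timeout: float = 10.0      # assumed: failed if silent for longer than this
    beats: dict = field(default_factory=dict)

    def record(self, team: int, time: float) -> None:
        """Store the time at which `team` reported a heartbeat."""
        self.beats.setdefault(team, []).append(time)

    def last_interval(self, team: int):
        """Return the time between the two most recent heartbeats, or None."""
        times = self.beats.get(team, [])
        if len(times) < 2:
            return None
        return times[-1] - times[-2]

    def classify(self, now: float) -> dict:
        """Label every known team as 'ok', 'slow', or 'failed' at time `now`."""
        intervals = {team: self.last_interval(team) for team in self.beats}
        known = [i for i in intervals.values() if i is not None]
        fastest = min(known) if known else None
        status = {}
        for team, times in self.beats.items():
            interval = intervals[team]
            if now - times[-1] > self.timeout:
                status[team] = "failed"
            elif (fastest is not None and interval is not None
                  and interval > self.slow_factor * fastest):
                status[team] = "slow"
            else:
                status[team] = "ok"
        return status


monitor = HeartbeatMonitor()
for t in (0.0, 1.0, 2.0):
    monitor.record(0, t)   # team 0 reports once per time unit
for t in (0.0, 2.0, 4.0):
    monitor.record(1, t)   # team 1 needs twice as long between reports
monitor.record(2, 0.0)     # team 2 goes silent after its first report
result = monitor.classify(now=11.0)
print(result)
```

In a real replicated run the timestamps would arrive through asynchronous messages between replicas rather than through direct method calls; the classification logic is the only part this sketch tries to show.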

    Volume 12

    Introduction, Dr. Roger A. Byrne, Dean; From the Editor, Dr. Larissa Kat Tracy; From the Designers, Rachel English, Rachel Hanson; Immortality in the Mortal World: Otherworldly Intervention in Lanval and The Wife of Bath's Tale by Haleigh James; Analysis of Phenolic Compounds in Moroccan Olive Oils by HPLC by Hannah Meyls; Art by Hope Irvin; The Effects of Cell Phone Use on Gameplay Enjoyment and Frustration by Megan E. Hlavaty, Samara L. Gall, and Austin J. Funk; Care, No Matter What: Planned Parenthood's Use of Organizational Rhetoric to Expand its Reputation by Karyn Keane; Analysis of Petroleum Products for Forensic and Environmental Applications by Sarah Ghali, Antonio Harvey, and Katelynn McCrillis; Art by Andrew Jones; The Triangle Shirtwaist Factory Fire by Rachel Hazelwood; Art by Madison Schmitz; Ercilla y la imitacion: Araucanos al estilo europeo by Marija Venta; Design by Haley Tebo; Design by Jeremiah Gilmer; White Supremacist's Appropriation of the Persuasion of Passivity in Marvel's Captain America by Bridget Dunn; Design by Benjamin Sullivan; Art by McKenzie Johnso

    The structure of the C-terminal actin-binding domain of talin

    Talin is a large dimeric protein that couples integrins to cytoskeletal actin. Here, we report the structure of the C-terminal actin-binding domain of talin, the core of which is a five-helix bundle linked to a C-terminal helix responsible for dimerisation. The NMR structure of the bundle reveals a conserved surface-exposed hydrophobic patch surrounded by positively charged groups. We have mapped the actin-binding site to this surface and shown that helix 1 on the opposite side of the bundle negatively regulates actin binding. The crystal structure of the dimerisation helix reveals an antiparallel coiled-coil with conserved residues clustered on the solvent-exposed face. Mutagenesis shows that dimerisation is essential for filamentous actin (F-actin) binding and indicates that the dimerisation helix itself contributes to binding. We have used these structures together with small angle X-ray scattering to derive a model of the entire domain. Electron microscopy provides direct evidence for binding of the dimer to F-actin and indicates that it binds to three monomers along the long-pitch helix of the actin filament.

    Enclave Tasking for DG Methods on Dynamically Adaptive Meshes

    High-order discontinuous Galerkin (DG) methods promise to be an excellent discretization paradigm for hyperbolic differential equation solvers running on supercomputers, since they combine high arithmetic intensity with localized data access, since they straightforwardly translate into nonoverlapping domain decomposition, and since they facilitate dynamic adaptivity without the need for conformal meshes. An efficient parallel evaluation of the DG weak formulation in an MPI+X setting, however, remains nontrivial as dependency graphs over dynamically adaptive meshes change with each mesh refinement or coarsening, as resolution transitions yield nontrivial data flow dependencies, and as data sent along domain boundaries through message passing (MPI) have to be triggered in the correct order. Domain decomposition with MPI alone starts to become insufficient if the mesh changes very frequently, if mesh changes cannot be predicted, and if limiters and nonlinear per-cell solves yield unpredictable costs per cell. We introduce enclave tasking as a task invocation technique for shared memory and MPI+X: It does not assemble any task graph; instead the mesh traversal spawns ready tasks directly. A marker-and-cell approach ensures that tasks feeding into MPI or triggering mesh modifications as well as latency-sensitive or bandwidth-demanding tasks are processed with high priority. The remaining cell tasks form enclaves, i.e., groups of tasks that can be processed in the background. Enclave tasking introduces high concurrency that is homogeneously distributed over the mesh traversal, mixes memory-intensive volumetric DG calculations with compute-bound Riemann solves, and helps to overlap communication with computations. Our work focuses on ADER-DG and patch-based finite volumes. Yet, we discuss how the paradigm can be generalized to the whole DG family and finite volume stand-alone solvers.
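
The prioritisation described in the abstract above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the functions `spawn_tasks` and `run`, the `is_critical` flag, and the use of a single priority queue are assumptions made for the example. It assumes one traversal marks each cell as critical (its result feeds MPI or triggers a mesh change) or not, spawns a ready task per cell without building a task graph, and serves critical tasks before the remaining enclave tasks.

```python
import heapq

HIGH, LOW = 0, 1  # smaller value is served first


def spawn_tasks(cells):
    """Traverse the cells once and spawn one ready task per cell.

    `cells` is a list of (cell_id, is_critical) pairs. `is_critical` is an
    assumed flag for cells whose results feed MPI or trigger mesh changes.
    """
    queue = []
    for order, (cell_id, is_critical) in enumerate(cells):
        priority = HIGH if is_critical else LOW
        # `order` preserves the traversal order among tasks of equal priority.
        heapq.heappush(queue, (priority, order, cell_id))
    return queue


def run(queue):
    """Process all tasks, critical ones first, and return the execution order."""
    executed = []
    while queue:
        _, _, cell_id = heapq.heappop(queue)
        executed.append(cell_id)
    return executed


cells = [("a", False), ("b", True), ("c", False), ("d", True)]
execution_order = run(spawn_tasks(cells))
print(execution_order)
```

Here the critical cells `b` and `d` run before the enclave cells `a` and `c`, which mirrors the goal of finishing boundary work early so that communication can overlap with the background computation.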
